DETECTION OF LOCAL VISUAL SPACE-TIME DETAILS IN A VIDEO SIGNAL



FIELD OF THE INVENTION 

The present invention relates to the field of video signal processing such as for TV 
or DVD signals. More specifically, the invention relates to methods for detection and 
segmentation of local visual space-time details in video signals. In addition, the invention 
relates to systems for detection and segmentation of local visual space-time details in video 
signals. 



BACKGROUND OF THE INVENTION 

Data compression of video signals comprising a stream of images (frames) has become
widespread, since a large amount of channel or storage capacity can be saved in transmission
of digital video data such as for TV or DVD. Specified standards such as MPEG and H.26x
provide a high degree of data compression using block-based motion compensation 
techniques. Normally, macro-blocks of 16x16 pixels are used for representation of motion 
information. For many normal video signals these compression techniques provide a high 
data compression rate without suffering from any visual artefact that is perceptible by the 
human eye. 

However, the standard compression schemes are known not to be transparent, i.e. for 
certain video signals they give rise to visual artefacts. Such visual artefacts occur in case the 
video signal includes motion pictures including local space-time details. Local space-time 
details are represented by spatial texture that varies its local characteristics in time in an 
indefinite manner. Examples are motion pictures of fire, wavy water, rising steam, leaves 
fluttering in the wind etc. In these cases the motion picture information representation by 
16x16 pixel macro-blocks offered by the compression schemes is too coarse to avoid loss of 
visual information. This is a problem when seeking to achieve optimal high quality video
reproduction in combination with the benefits of MPEG or H.26x compression with respect
to bit rate reduction.

In order to avoid visual artefacts in a video signal intended for compression, it is
necessary to detect local space-time details that may cause visual artefacts by compression
prior to applying the compression procedure. Having located these parts in the video signal it
is possible to apply a special processing to these parts so as to avoid artefacts being
introduced by the compression procedure. Methods for detecting and indicating image blocks
of a video signal that include space-time details are known.

EP 0 571 121 B1 describes an image processing method being an elaboration of the
known so-called Horn-Schunck method. This method is described in B. K. Horn, and B. G.
Schunck, "Determining Optical Flow", Artificial Intelligence, Vol 17, 1981, pp. 185-204.
The Horn-Schunck method includes extraction of pixel-wise image velocity information called
optical flow. For each single image an optical flow vector is determined, and a condition
number is computed based on this vector. In EP 0 571 121 B1 a local condition number is
computed based on the optical flow vector for each image, the goal being to obtain a robust
optical flow.

EP 1 233 373 A1 describes a method for segmentation of fragments of an image
exhibiting similarities in various visual attributes. Various criteria are described for
combining small regions of an image into larger regions exhibiting similar characteristics
within a predetermined threshold. In relation to detection of motion an affine motion model is
used which implies calculation of optical flow.

US 6,456,731 B1 describes a method for estimation of optical flow and an image
synthesis method. The described estimation of optical flow is based on the known
Lucas-Kanade method described in B. D. Lucas, and T. Kanade, "An iterative image registration
technique with an application to stereo vision", Proceedings of the 7th International Joint
Conference on Artificial Intelligence, 1981, Vancouver, pp. 674-679. The Lucas-Kanade
method estimates optical flow by assuming that optical flow is constant within a local
neighbourhood of a pixel. The image synthesis method is based on a process of registering
consecutive images of a sequence by using values of estimated optical flow and a velocity of
specifically tracked image points, visually salient like corner points, using the known
so-called Tomasi-Kanade temporal feature tracking method. Thus, the method described in US
6,456,731 B1 does not perform image partitioning, but similar to the method described in EP
0 571 121 B1, it performs the step of computing optical flow, and subsequently the step of
image registering.



SUMMARY OF THE INVENTION 

It may be seen as an object of the present invention to provide a method of detecting
local space-time details in a video signal. The method must be simple to implement and it
must be adapted for application within low cost equipment. By space-time details of an
image is understood image regions containing a large spatial brightness variation that exhibits
strong temporal changes at the local level, wherein the velocities of these spatial parts are
weakly correlated in time.

A first aspect of the present invention provides a method of detecting local space-time
details of a video signal representing a plurality of images, the method comprising, for each
image, the steps of:

A) dividing the image into one or more blocks of pixels,

B) calculating at least one space-time feature for at least one pixel within each of said one or
more blocks,

C) calculating for each of the one or more blocks at least one statistical parameter for each of
the at least one space-time features calculated within the block, and

D) detecting blocks wherein the at least one statistical parameter exceeds a predetermined
level.

Preferably, the at least one space-time feature comprises visual normal flow
magnitude and/or visual normal flow direction. The visual normal flow represents the
component of the optical flow that is parallel to image brightness spatial gradient. The at least
one space-time feature may further comprise visual normal acceleration magnitude and/or
visual normal acceleration direction. Visual normal acceleration describes temporal variation
of the visual normal flow along the normal (image brightness gradient) direction.

Preferably, the method further comprises the steps of calculating horizontal and 
vertical histograms of the at least one space-time feature calculated in step C). 

The at least one statistical parameter of step D) may comprise one or more of:
variance, average, and at least one parameter of a probability function. The block(s) of pixels
are preferably non-overlapping square blocks, and their size may be: 2x2 pixels, 4x4 pixels,
6x6 pixels, 8x8 pixels, 12x12 pixels, or 16x16 pixels.

The method may further comprise the step of pre-processing the image prior to
applying step A), so as to reduce noise in the image, this pre-processing preferably
comprising the step of convolving the image with a low-pass filter.

The method may further comprise an intermediate step between step C) and D), the
intermediate step comprising calculating at least one inter-block statistical parameter
involving at least one of the statistical parameters calculated for each block. The at least one
inter-block statistical parameter may be calculated using a 2-D Markovian non-causal
neighbourhood structure.

The method may further comprise the step of determining a pattern of temporal
evolution for each of the at least one statistical parameter calculated in step C). The method
may further comprise the step of indexing at least part of an image comprising one or more
blocks detected in step D). Furthermore, the method may comprise the step of increasing data
rate allocation to the one or more blocks detected in step D). In another embodiment, the
method may further comprise the step of inserting an image in a de-interlacing system.

A second aspect of the invention provides a system for detecting local space-time
details of a video signal representing a plurality of images, the system comprising:

- means for dividing an image into one or more blocks of pixels,

- space-time feature calculating means for calculating at least one space-time feature for at
least one pixel within each of the one or more blocks,

- statistical parameter calculating means for calculating for each of the one or more blocks at
least one statistical parameter for each of the at least one space-time features computed
within the one or more blocks, and

- detecting means for detecting one or more blocks wherein the at least one statistical
parameter exceeds a predetermined level.

A third aspect of the invention provides a device comprising a system according to
the second aspect.

A fourth aspect of the invention provides a signal processor system programmed to 
operate according to the method of the first aspect. 

A fifth aspect of the invention provides a de-interlacing system for a television (TV)
apparatus, the de-interlacing system operating according to the method of the first aspect.

A sixth aspect provides a video signal encoder for encoding a video signal
representing a plurality of images, the video signal encoder comprising:

- means for dividing an image into one or more blocks of pixels,

- space-time feature calculating means for calculating at least one space-time feature for at
least one pixel within each of the one or more blocks,

- statistical parameter calculating means for calculating for each of the one or more blocks at
least one statistical parameter for each of the at least one space-time features computed
within the one or more blocks,

- means for allocating data to the one or more blocks according to a quantisation scale, and

- means for adjusting the quantisation scale for the one or more blocks in accordance with the
at least one statistical parameter.

A seventh aspect provides a video signal representing a plurality of images, the video
signal comprising information regarding image segments exhibiting space-time details
suitable for use with the method of the first aspect.

An eighth aspect provides a video storage medium comprising video signal data
according to the seventh aspect.

A ninth aspect provides a computer useable medium having a computer readable
program code embodied therein, the computer readable program code comprising:

- means for causing a computer to read a video signal representing a plurality of images,

- means for causing the computer to divide a read image into one or more blocks of pixels,

- means for causing the computer to calculate at least one space-time feature for at least one
pixel within each block,

- means for causing the computer to calculate for each of the blocks at least one statistical
parameter for each of the at least one space-time features calculated within the one or more
blocks, and

- means for causing the computer to detect blocks wherein the at least one statistical
parameter exceeds a predetermined level.

A tenth aspect provides a video signal representing a plurality of images, the video
signal being compressed according to a video compression standard, such as MPEG or
H.26x, comprising a specified individual allocation of data to blocks of each image, wherein
a data rate allocated to one or more selected blocks of images exhibiting space-time details is
increased compared to the specified allocation of data to the one or more selected blocks.

An eleventh aspect provides a method of processing a video signal, wherein the 
method of processing comprises the method of the first aspect. 

A twelfth aspect provides an integrated circuit comprising means for processing a
video signal according to the method of the first aspect.

A thirteenth aspect provides a program storage device readable by a machine and 
encoding a program of instructions for executing the method of the first aspect. 

BRIEF DESCRIPTION OF DRAWINGS

In the following the invention is described in detail with reference to the
accompanying figures, wherein

Fig. 1 shows an illustration of normal and tangential flows at two points of a contour
moving with uniform velocity,

Fig. 2a shows an example of an image of two persons and a fountain basin including
splashing water,

Fig. 2b shows a grey scale plot representing for the image of Fig. 2a a block-wise
level of normal flow variance, wherein white blocks indicate blocks calculated to have a high
level of normal flow variance,

Fig. 3 shows a flow diagram of a system according to the present invention, and

Fig. 4 shows an example of a normal flow variance histogram.

While the invention is susceptible to various modifications and alternative forms,
specific embodiments have been shown by way of example in the drawings and will be
described in detail herein. It should be understood, however, that the invention is not
intended to be limited to the particular forms disclosed. Rather, the invention is to cover all
modifications, equivalents, and alternatives falling within the scope of the invention as
defined by the appended claims.



DETAILED DESCRIPTION OF THE INVENTION 

According to an embodiment of the present invention the major operations to be 
carried out for processing an image are the steps: 

A) Divide image into blocks 

B) Estimate local feature(s) 

C) Calculate feature statistics per block 

Step A) of processing an image is to divide the image into blocks. Preferably, the
blocks coincide with macro blocks used by standard compression such as MPEG and H.26x.
Therefore, the image is preferably divided into non-overlapping blocks of 8x8 pixels or
16x16 pixels. When the image blocks are 8x8 pixels large and aligned with the
(MPEG) image grid, they coincide with typical I-frame DCT/BDCT computation and describe
spatial details information. When they are 16x16 pixels large and aligned with the
(MPEG) image grid, they coincide with P-frame (B-frame) macro blocks used for motion
compensation (MC) in block-based motion estimation in MPEG/H.26x video standards, and
this allows spatio-temporal details information to be described.
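The block division of step A) can be sketched as follows. This is a minimal illustration only, assuming a single-channel image held in a NumPy array and a grid anchored at the top-left pixel; the function name `split_into_blocks` and the cropping of incomplete border blocks are assumptions of this sketch, not part of the described method.

```python
import numpy as np

def split_into_blocks(image, block=16):
    """Tile a 2-D image into non-overlapping block x block tiles.

    Rows and columns that do not fill a whole block are cropped, so the
    tiles stay aligned with a grid anchored at the top-left pixel.
    Returns an array of shape (rows, cols, block, block).
    """
    h, w = image.shape
    rows, cols = h // block, w // block
    cropped = image[:rows * block, :cols * block]
    # (rows, block, cols, block) -> (rows, cols, block, block)
    return cropped.reshape(rows, block, cols, block).swapaxes(1, 2)

# Example: a 36x50 frame yields a 2x3 grid of 16x16 blocks.
frame = np.arange(36 * 50, dtype=np.float64).reshape(36, 50)
tiles = split_into_blocks(frame, block=16)
```

Here `tiles[r, c]` is the block in block-row `r` and block-column `c`, so per-block statistics can be computed by reducing over the last two axes.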
Step B) comprises estimating at least one local feature, the local feature relating to
spatial, temporal, and/or spatio-temporal details of the image. Preferably, two features are
used together with different associated metrics. The estimation of local features is based on a
combination of spatial and temporal image brightness gradients. The preferred features are
visual normal flow, i.e. visual normal velocity, and visual normal acceleration. The local
feature may be based on either or both of visual normal velocity and visual normal
acceleration. For the case of visual normal velocity two consecutive frames (or images) are
used, while for the visual normal acceleration three consecutive frames (or images) are
necessary. A more thorough description of visual normal velocity and visual normal
acceleration is given in the following.

Step C) comprises calculating per block feature statistics. This includes the
computation of feature average and variance. Also, different probability density functions are
matched to these per block statistics. The per block statistics provide information so as to set
up thresholds or criteria allowing a categorisation of each block with respect to the amount of
space-time details. Thus, the per block statistics allow detection of blocks with a high
amount of space-time details, since such blocks exhibit per block statistical parameters
exceeding predetermined thresholds.

The visual normal flow represents the component of the optical flow that is parallel to
image brightness spatial gradient. Optical flow is the most detailed velocity information that
can be extracted locally by processing two successive frames or video fields, but it is
computationally expensive to extract. The normal flow, on the other hand, is easy to compute
and it is very rich in local spatial and temporal information. For example, calculation of
optical flow typically requires 7x7x2 space-time neighbourhoods, while normal flow requires
only 2x2x2 neighbourhoods. In addition, calculation of optical flow requires an optimisation,
while calculation of normal flow does not.

The normal flow magnitude determines the amount of motion parallel to the local
image brightness gradient and the normal flow direction describes the local image brightness
orientation. Visual normal flow is calculated from:

$$V_x \frac{\partial I(x,y,t)}{\partial x} + V_y \frac{\partial I(x,y,t)}{\partial y} + \frac{\partial I(x,y,t)}{\partial t} = 0,$$

where I is brightness, x and y are spatial variables, t is the time variable, and $V_x$ and $V_y$
are the components of the image velocity. The normal flow
direction encodes implicitly spatial variation of image brightness gradient and therefore
spatial texture information. The normal acceleration describes, as a second order effect, how
the normal flow varies locally.

Visual normal flow is defined as the normal, i.e. parallel to the spatial image gradient,
component of the local image velocity or optical flow. The image velocity can be
decomposed, at each image pixel, into normal and tangential components.

Fig. 1 shows, for illustration, a well-defined image boundary or contour that passes
the target pixel of an image. The diagram in Fig. 1 shows the normal and tangential flows at
two points of a contour moving with uniform velocity V. Going from point A to point B, the
normal and tangential image velocities (normal flow and tangential flow, respectively)
change their spatial orientation. This indeed happens from point to point due to contour
curvature. The normal and tangential flows are always 90° apart.

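An important property of the normal flow is that this is the only image velocity
component that can be locally computed in the image. The tangential component cannot be
computed. In order to explain this, it can be assumed that the image brightness $I(\cdot,\cdot;\cdot)$ is
constant when image point $P(x,y)$ at time $t$ moves to position $P'(x',y')$ at time $t' = t + \Delta t$,
where $(x',y') = (x,y) + \vec{V}\,\Delta t$. The image velocity $\vec{V}$ is considered to be constant and $\Delta t$ is
"small". Therefore,

$$I(x',y',t') \approx I(x,y,t) \qquad (1)$$

or

$$\vec{V}\cdot\nabla I(x,y,t) + \frac{\partial I(x,y,t)}{\partial t} \approx 0, \qquad (2)$$

where '$\approx$' means approximate and $\nabla \equiv \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right)$. Since $\vec{V} = \vec{V}_n + \vec{V}_t$ and $\vec{V}_t\cdot\nabla I = 0$,
(2) reduces to:

$$\vec{V}_n\cdot\nabla I(x,y,t) + \frac{\partial I(x,y,t)}{\partial t} \approx 0. \qquad (3)$$

This means that:

$$\vec{V}_n = \vec{n}\,|\vec{V}_n|, \qquad (4)$$

with

$$|\vec{V}_n| = -\frac{\partial I(x,y,t)/\partial t}{|\nabla I(x,y,t)|}, \qquad (5)$$

$$\vec{n} = \frac{\nabla I(x,y,t)}{|\nabla I(x,y,t)|}. \qquad (6)$$

The normal flow, in distinction to the image velocity, is also a measure of local image
brightness gradient orientation, and this measure implicitly includes the amount of spatial
shape variability, e.g. curvature, texture orientation, etc.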
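As a minimal sketch of how the normal flow might be evaluated once the brightness derivatives are available, the following takes the magnitude as the negative temporal derivative divided by the spatial gradient magnitude, and the direction as the unit spatial gradient. The function name `normal_flow` and the guard value `eps` for flat regions are assumptions of this illustration.

```python
import numpy as np

def normal_flow(ix, iy, it, eps=1e-6):
    """Return (magnitude, nx, ny) of the normal flow at every pixel.

    ix, iy, it are the brightness derivatives along x, y and t.
    Pixels whose spatial gradient is below eps (a hypothetical guard value)
    get zero flow, because the normal direction is undefined there.
    """
    ix, iy, it = (np.asarray(a, dtype=np.float64) for a in (ix, iy, it))
    grad = np.hypot(ix, iy)                        # spatial gradient magnitude
    valid = grad > eps
    safe = np.where(valid, grad, 1.0)              # avoids division by zero
    magnitude = np.where(valid, -it / safe, 0.0)   # signed normal flow magnitude
    nx = np.where(valid, ix / safe, 0.0)           # unit normal, x component
    ny = np.where(valid, iy / safe, 0.0)           # unit normal, y component
    return magnitude, nx, ny

# A brightness ramp I = x - 2t moves along +x at 2 pixels per frame.
ix = np.ones((4, 4))
iy = np.zeros((4, 4))
it = np.full((4, 4), -2.0)
magnitude, nx, ny = normal_flow(ix, iy, it)
```

For this ramp the sketch recovers a normal flow of 2 pixels per frame along the x axis, which is the full velocity because the motion is parallel to the gradient.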

Preferably, two different methods may be used to compute the normal flow in discrete
images. One method is the 2x2x2 brightness cube method described in B.K.P.
Horn, Robot Vision, The MIT Press, Cambridge, Massachusetts, 1986. Another method is the
feature based method.

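In the 2x2x2 brightness cube method the spatial and temporal derivatives are
approximated according to (7)-(9).

$$\frac{\partial I(x,y,t)}{\partial x} \approx \tfrac{1}{4} \times \big[(I[i+1][j][k] + I[i+1][j+1][k] + I[i+1][j][k+1] + I[i+1][j+1][k+1]) - (I[i][j][k] + I[i][j+1][k] + I[i][j][k+1] + I[i][j+1][k+1])\big] \qquad (7)$$

$$\frac{\partial I(x,y,t)}{\partial y} \approx \tfrac{1}{4} \times \big[(I[i][j+1][k] + I[i+1][j+1][k] + I[i][j+1][k+1] + I[i+1][j+1][k+1]) - (I[i][j][k] + I[i+1][j][k] + I[i][j][k+1] + I[i+1][j][k+1])\big] \qquad (8)$$

$$\frac{\partial I(x,y,t)}{\partial t} \approx \tfrac{1}{4} \times \big[(I[i][j][k+1] + I[i+1][j][k+1] + I[i][j+1][k+1] + I[i+1][j+1][k+1]) - (I[i][j][k] + I[i+1][j][k] + I[i][j+1][k] + I[i+1][j+1][k])\big] \qquad (9)$$

These discrete derivatives are computed inside the cells of a 2x2x2 brightness cube.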
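The 2x2x2 brightness cube estimates can be sketched as below, following the first-difference averaging described in the cited Horn reference. The function name `cube_derivatives` is hypothetical, and the arrays are assumed to be indexed `[y, x]`, so the column index plays the role of i and the row index the role of j.

```python
import numpy as np

def cube_derivatives(f0, f1):
    """Average first differences over every 2x2x2 cube of two frames.

    f0 is frame k and f1 is frame k+1, both indexed [y, x].
    Returns (ix, iy, it), each of shape (H-1, W-1).
    """
    f0 = np.asarray(f0, dtype=np.float64)
    f1 = np.asarray(f1, dtype=np.float64)
    # Cube corners: a = frame k, b = frame k+1; suffix = (y offset, x offset).
    a00, a01, a10, a11 = f0[:-1, :-1], f0[:-1, 1:], f0[1:, :-1], f0[1:, 1:]
    b00, b01, b10, b11 = f1[:-1, :-1], f1[:-1, 1:], f1[1:, :-1], f1[1:, 1:]
    ix = 0.25 * ((a01 + a11 + b01 + b11) - (a00 + a10 + b00 + b10))
    iy = 0.25 * ((a10 + a11 + b10 + b11) - (a00 + a01 + b00 + b01))
    it = 0.25 * ((b00 + b01 + b10 + b11) - (a00 + a01 + a10 + a11))
    return ix, iy, it

# Check on a linear brightness pattern I = 3x + 5y + 7t.
y, x = np.mgrid[0:6, 0:8]
f0 = 3.0 * x + 5.0 * y
f1 = f0 + 7.0
ix, iy, it = cube_derivatives(f0, f1)
```

On a linear pattern the estimates are exact, which makes such a pattern a convenient sanity check before applying the sketch to real frames.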

The feature based method is based on the following steps:

(a) Finding image points with high spatial gradients. This is implemented by: (i)
smoothing the image $I(\cdot,\cdot;\cdot)$ by applying to it a binomial approximation to a Gaussian
function; (ii) computing the discretised spatial image gradients
$\partial I/\partial x \approx \tfrac{1}{2}\,(I[i+1][j][k] - I[i-1][j][k])$ and
$\partial I/\partial y \approx \tfrac{1}{2}\,(I[i][j+1][k] - I[i][j-1][k])$; (iii) finding the subset of image points for
which $|\nabla I(\cdot,\cdot;\cdot)|$ is larger than a pre-determined threshold. Also, use
$\partial I/\partial t \approx \tfrac{1}{2}\,(I[i][j][k+1] - I[i][j][k-1])$, which involves three instead of two
successive frames.

(b) The normal flow is computed iteratively at each feature position, e.g. point with
"high" spatial gradient, by using the discrete version of (5) and (6). First, with the
initial computation of the normal flow, the local image is warped according to it to
refine the normal flow value. From the residual temporal derivative the residual
normal flow is computed and the initial normal flow estimate is updated. This is
repeated until the residual normal flow is smaller than a pre-determined tolerance (e.g. 0001).
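Step (a) of the feature based method can be sketched as follows. The sketch covers only the smoothing, central differences and gradient thresholding; the iterative warping refinement of step (b) is not shown. The 3-tap kernel [1, 2, 1]/4, the edge replication at the borders, the zero derivatives on the border pixels and the names `binomial_smooth` and `gradient_features` are all assumptions of this illustration.

```python
import numpy as np

def binomial_smooth(img):
    """Separable 3-tap binomial filter [1, 2, 1] / 4 with edge replication."""
    p = np.pad(np.asarray(img, dtype=np.float64), 1, mode="edge")
    rows = 0.25 * (p[:-2, :] + 2.0 * p[1:-1, :] + p[2:, :])       # vertical pass
    return 0.25 * (rows[:, :-2] + 2.0 * rows[:, 1:-1] + rows[:, 2:])

def gradient_features(prev, cur, nxt, grad_threshold):
    """Central-difference derivatives and a mask of high-gradient pixels.

    prev, cur, nxt are three successive frames indexed [y, x].
    Border pixels keep a zero spatial derivative in this sketch.
    """
    cur_s = binomial_smooth(cur)
    ix = np.zeros_like(cur_s)
    iy = np.zeros_like(cur_s)
    ix[:, 1:-1] = 0.5 * (cur_s[:, 2:] - cur_s[:, :-2])
    iy[1:-1, :] = 0.5 * (cur_s[2:, :] - cur_s[:-2, :])
    it = 0.5 * (binomial_smooth(nxt) - binomial_smooth(prev))
    mask = np.hypot(ix, iy) > grad_threshold
    return ix, iy, it, mask

# A static vertical step edge between columns 3 and 4.
frame = np.zeros((6, 8))
frame[:, 4:] = 100.0
ix, iy, it, mask = gradient_features(frame, frame, frame, grad_threshold=20.0)
```

With this input only the two columns adjacent to the edge pass the threshold, and the temporal derivative is zero because the three frames are identical.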

Normal acceleration describes temporal variation of the normal flow along the normal
(image brightness gradient) direction. Its importance is due to the fact that the acceleration
measures how much the normal flow varies between at least three successive frames,
thus making it possible to determine how much the space-time details vary between pairs of
frames.

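One way to define the normal acceleration is by taking the temporal derivative of (3):

$$\frac{\partial}{\partial t}\left[\vec{V}_n\cdot\nabla I(x,y,t) + \frac{\partial I(x,y,t)}{\partial t}\right] \approx 0, \qquad (10)$$

so that:

$$\vec{A}_n = \vec{n}\,|\vec{A}_n|, \qquad (11)$$

and

$$|\vec{A}_n| = -\frac{1}{|\nabla I(x,y,t)|}\left[\frac{\partial^2 I(x,y,t)}{\partial t^2} + \vec{V}_n\cdot\nabla\frac{\partial I(x,y,t)}{\partial t}\right]. \qquad (12)$$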



Because of the second temporal derivative in (12), it is necessary to use a minimum of
three successive frames when implementing (12). Taking a 3 x 3 x 3 pixels wide cube to
compute the discretised versions of the derivatives in (12), it can be shown that

(13)

The other discretised derivatives can be obtained analogously to (7)-(9) on the 3 x 3 x 3 cube.

The goal of computing feature statistics is to detect space-time regions where a given
feature varies most, i.e. the segmentation and detection of high space-time details. This may be
implemented according to the following algorithm, given two (three) successive images:

1. Dividing the image into non-overlapping (square or rectangular) blocks,

2. Computing within each block a local feature set,

3. Determining, for each block, the average of the feature set computed in 2.,

4. Computing the variance, i.e. the average variation of each feature within each block from the
average computed in 3., and

5. Given a pre-determined threshold, selecting the set of blocks for which the variance computed
in 4. is larger than that threshold.
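The block statistics algorithm above can be sketched as follows, assuming the per-pixel feature (for instance the normal flow magnitude) is already available as a 2-D array. The function name `detect_detail_blocks`, the default block size and the threshold value are assumptions of this illustration.

```python
import numpy as np

def detect_detail_blocks(feature, block=8, threshold=1.0):
    """Per-block mean and variance of a feature map, plus a detection mask.

    feature is a 2-D array of per-pixel feature values; incomplete border
    blocks are cropped. Returns (mean, var, detected), each of shape
    (rows, cols), where detected marks blocks whose variance exceeds threshold.
    """
    feature = np.asarray(feature, dtype=np.float64)
    h, w = feature.shape
    rows, cols = h // block, w // block
    tiles = (feature[:rows * block, :cols * block]
             .reshape(rows, block, cols, block)
             .swapaxes(1, 2)
             .reshape(rows, cols, block * block))
    mean = tiles.mean(axis=2)                               # step 3
    var = ((tiles - mean[..., None]) ** 2).mean(axis=2)     # step 4
    return mean, var, var > threshold                       # step 5

# One 8x8 block alternates between 0 and 4; the other three blocks are flat.
feature = np.zeros((16, 16))
yy, xx = np.mgrid[0:8, 0:8]
feature[:8, :8] = 4.0 * ((yy + xx) % 2)
mean, var, detected = detect_detail_blocks(feature, block=8, threshold=1.0)
```

In this example only the top-left block is detected: its mean is 2 and its variance is 4, while the flat blocks have zero variance.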



In our implementation of the algorithm we choose square (8x8 or 16x16) blocks.
This will tessellate the image into square blocks, and the remainder of it will be left
untessellated; in order to reduce this residual untessellated image region a rectangular
tessellation could be used, but this is of less interest because we want to align these blocks
with MPEG 8x8 (DCT) or 16x16 (MC) blocks for visual artefact pre-detection. The
computation of feature values within each block is implemented either at each pixel for
which $|\nabla I(\cdot,\cdot;\cdot)|$ is larger than a pre-determined threshold T, or at feature points for which
$|\nabla I(\cdot,\cdot;\cdot)|$ is larger than a second pre-determined threshold; usually T is the smaller of the
two thresholds. The statistics
exemplified in steps 4. and 5. are just an illustration. More detailed statistics could be
computed. Also, specific probability density functions (pdf) and their statistics could be
computed.

In order to make the computations according to the above-mentioned or related
implementations more robust, a set of pre- and post-processing operations may be applied.
An example of pre-processing is to convolve the input images with low-pass filters.
Post-processing may include, for example, comparing neighbour blocks with respect to their
statistics, e.g. feature variance.

Fig. 2a shows an example of an image taken from a sequence of images. In the image
two persons are watching splashing water in a fountain basin. One of the persons is partly
behind the splashing water. Such an image therefore includes local parts exhibiting an
example of a phenomenon expected to produce a chaotic brightness pattern, namely the
splashing water. Therefore, the image is taken from a moving image sequence with the
potential of a high amount of local space-time details. The image has been processed
according to the present invention in blocks, and for each block a variance of normal flow
magnitude has been calculated as a measure representing the amount of space-time details.

In Fig. 2b the blocks of the image of Fig. 2a are shown in a grey scale indicating
normal flow magnitude variance and thereby the amount of local space-time details.
White coloured blocks indicate regions with a high level of normal flow variance, whereas
dark grey blocks indicate regions with a low level of normal flow variance. As seen from Fig.
2b white blocks appear in parts of the image with splashing water and thus these local image
regions are found to exhibit a large amount of local space-time details according to the
processing method. The steady image regions, such as the person to the left and the fountain
basin to the right, are seen to be dark grey, indicating that these regions are detected to
exhibit a low normal flow variance.

Fig. 3 shows a flow diagram structure of a system for processing space-time details
information. The system sketched in Fig. 3 can be used for different applications by using
different ones of the paths A, B and C indicated in the flow diagram. The elements of Fig. 3 are:

VI: Video Input 
Pre-P: Pre-processing 

STDE: Space-time detail estimation and detection 

Post-P: Post-processing 

VQI: Visual quality improvement 

Disp: Display 

St: Storage medium 

Video input of Fig. 3 represents a video signal representing a sequence of images. The
video input may either be applied directly, such as by a wired or wireless connection, or as
indicated in Fig. 3, the video signal may be stored on a storage medium before being processed. The
storage medium may be a hard disk, a writeable CD, a DVD, computer memory etc. Input
may either be a compressed video format, such as MPEG or H.26x, or it may be a
non-compressed signal, i.e. a full resolution representation of the video signal. If an analog video
signal is input, the VI step may include an analog to digital conversion.

Pre-processing of Fig. 3 is optional. If preferred, various signal processing may be
applied in order to reduce noise or other visual artefacts in the video signal before applying
the space-time detection processing. This enhances the effect of the space-time detection
processing.

Space-time detail estimation and detection (STDE) is performed according to the
above-described methods. Preferably the method includes calculation of visual normal flow
and it may further include calculation of visual normal acceleration. The necessary
calculation means may be a dedicated video signal processor. Alternatively, since the
amount of calculations needed with the methods according to the present invention is low, the signal
processing may be implemented using signal processing power already present in the device,
such as a TV set or a DVD player.

Post-processing may include various per block statistical methods performed on
statistical results for each of the blocks of the STDE part of the system of Fig. 3. The
post-processing may further include an integration in time of the statistical results for each of the
blocks of the STDE step of Fig. 3. In addition, the post-processing may comprise determining
a pattern of temporal evolution of the per block statistics in time. This is necessary to
determine which parts have stable statistics.
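The integration in time of per-block statistics could, for example, be realised as an exponentially weighted running average over successive images. This is only one possible choice; the smoothing factor `alpha` and the name `integrate_block_stats` are assumptions of this sketch and do not come from the description.

```python
import numpy as np

def integrate_block_stats(stat_sequence, alpha=0.5):
    """Exponentially weighted running average of per-block statistics.

    stat_sequence is an iterable of (rows, cols) arrays, one per image.
    alpha (hypothetical) weights the newest image; 1 - alpha weights the past.
    Returns the integrated statistic after the last image, or None if empty.
    """
    smoothed = None
    for stat in stat_sequence:
        stat = np.asarray(stat, dtype=np.float64)
        if smoothed is None:
            smoothed = stat.copy()
        else:
            smoothed = alpha * stat + (1.0 - alpha) * smoothed
    return smoothed

# One block whose variance jumps from 0 to 4 and then stays at 4.
sequence = [np.array([[0.0]]), np.array([[4.0]]), np.array([[4.0]])]
integrated = integrate_block_stats(sequence, alpha=0.5)
```

With `alpha = 0.5` the integrated value moves from 0 to 2 and then to 3, approaching the new level gradually, so short-lived fluctuations have less influence than sustained ones.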

Using path A of Fig. 3 the video signal is stored after detection of space-time details.
Preferably, the video signal is stored together with indexing information allowing further
processing to be performed later.

Alternatively, visual quality improvement means may be applied before storing, i.e.
path B may be used. Visual quality improvement means may be applied to the signal so as
to utilise the provided information regarding local regions of images containing a large
amount of space-time details. For a non-compressed video signal this may be done by
allocating, to blocks with space-time details, a larger data rate than would normally be
allocated by standard coding schemes, for example by reducing the quantisation scale in
I-frame and P-frame coding, to cope with higher levels of details. The signal may then be
stored in an encoded version, however processed so as to eliminate or avoid visual artefacts.
Alternatively, the video signal may be stored without encoding but provided with indexing information
indicating blocks or regions with space-time details, thus enabling further processing such as
later encoding or using the space-time index information as a search criterion.
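The idea of reducing the quantisation scale for detected blocks can be sketched as follows. The sketch operates on a per-block map of quantisation scale values and is not tied to any particular encoder interface; the reduction factor, the lower bound and the name `adjust_quantiser_scale` are hypothetical.

```python
import numpy as np

def adjust_quantiser_scale(base_scale, detected, reduction=0.5, min_scale=1):
    """Lower the quantisation scale of blocks flagged as space-time detail.

    base_scale is a (rows, cols) array with the scale an encoder would
    normally use; detected is a boolean mask of the same shape.
    A smaller scale means finer quantisation and therefore more bits.
    reduction and min_scale are hypothetical tuning values.
    """
    base = np.asarray(base_scale, dtype=np.float64)
    scaled = np.where(detected, base * reduction, base)
    return np.maximum(np.rint(scaled), min_scale).astype(int)

# Only the top-left block was detected as containing space-time details.
base_scale = np.full((2, 2), 8)
detected = np.array([[True, False], [False, False]])
new_scale = adjust_quantiser_scale(base_scale, detected)
```

Here the detected block receives a scale of 4 instead of 8, while the remaining blocks keep the scale chosen by the normal rate control.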

The last processing part of the system of Fig. 3 is a visual output, i.e. display, such as
on a TV screen, a computer screen etc. Alternatively, the video signal may be applied to
further devices or processors before being displayed or stored.

An application (i) of the principles according to the present invention is to eliminate
or at least reduce visual artefacts in a video signal, such as the artefact blockiness or temporal
flickering, by allocating more bits for blocks detected to exhibit space-time details. In some
situations it may be preferred merely to obtain an indication of image/video regions which
will probably contain visual artefacts, such as blockiness, ringing, and mosquito "noise", for
digitally (MPEG, H.26x) processed videos once encoded.

Another application (ii) is to implement a low cost motion detection indicator for field
insertion in de-interlacing for TV systems that can profit from a spatial sharpness
improvement. This may be especially suitable for application within low cost de-interlacers,
the principles according to the invention providing partial motion compensation
information.

Yet another application (iii) is to detect, segment, index and retrieve image regions
detected to exhibit space-time details in long video databases. In this way it may be possible
to provide a search facility that allows a quick indexing of sequences of e.g. video films that
contain waterfalls, ocean waves, hair/leaves/grass moving in the wind etc. Depending on
which application is targeted, different processing blocks are used.

Yet another possible application (iv) is to perform selective sharpening, i.e. to
adaptively change the spatial sharpness (peaking and clipping) to highlight selected regions
of an image where a sharper image is desired, and to reduce the possibility of increasing the
visibility of digital artefacts in regions that are de-selected.

For example, application (i) can be used in visual quality improvement for both
display and storage applications. For display applications path C in Fig. 3 is used. Display
applications may be such as high quality TV sets. Detection and segmentation of space-time
details is important due to the fact that visual artefacts can be eliminated or at least reduced
by an appropriate allocation of bits in response to local/regional image characteristics, such
as a customised bit-rate control per 8x8 or 16x16 image block. This is important in relation to
visual artefacts because merely detecting them is often too late to reduce their visibility or
effects on the visual quality of motion pictures when displayed.

In storage applications path A or path B of Fig. 5 may be used. By using path A the 
video signal is stored prior to performing visual quality improvement. However, using path A 
may include detection and segmentation of space-time details and storage of an index of 
regions, such as 8x8 or 16x16 pixel blocks, that contain a large amount of space-time details. 
In this way long video databases (stored content) may be processed, enabling further 
processing at a later stage. This is useful for content that is highly detailed and for 
which no effective representation is known for content description. Video signals may be 
stored either compressed or uncompressed. By storing uncompressed data a later 
compression can be performed taking advantage of the stored index relating to local 
space-time details. 



By using path B video signals are stored after being properly processed with respect 
to increasing visual quality based on the detected local space-time details. As mentioned, the 
visual quality improvement could be performed by allocating more data to blocks exhibiting 
space-time details. Path B may therefore also be used for processing large video 
databases. Using path B video signals can be stored compressed, since a proper signal 
treatment has been carried out ensuring that a high visual quality regarding space-time 
details is obtained even with the use of compression. 

Among a large number of different devices or systems, or parts of devices or systems, 
the principles according to the invention may be applied within TV systems, such as TV sets, 
and DVD+RW equipment, such as DVD players or DVD recorders. The proposed methods 
may be applied within digital (LCD, LCoS) TV sets where new types of digital artefacts 
occur and/or become more visible, thus requiring a generally high video signal quality. 

The principles of the present invention relating to visual quality improvement may 
also be used within wireless hand-held miniature devices featuring displays adapted for 
showing motion pictures. For example, a high visual quality of motion pictures on mobile 
phones with near-to-the-eye displays can be combined with a still moderate data rate 
requirement. For devices with a quite poor spatial resolution the visual quality improvements 
according to the invention may be used to reduce the required data rate for the video signal, 
still without blockiness and related visual artefacts. 

In addition, the principles according to the invention may be applied within MPEG 
coding and decoding equipment. The methods may be applied within such encoders or 
decoders. Alternatively, separate video processor devices may be applied prior to existing 
encoders. The principles according to the invention may be applied within consumer 
equipment as well as within professional equipment. 

In an embodiment of a video signal encoder according to the invention, a quantisation 
scale depending on space-time details information is applied at the encoder side. The 
quantisation scale is modulated by space-time details information. The smaller (larger) this 
scale, the more (fewer) steps the quantiser has, and therefore the more spatial detail is 
enhanced (blurred). Preferably, a video signal encoder according to the invention is capable 
of producing signal formats in accordance with MPEG or H.26x formats. 

In a preferred embodiment, a fixed quantisation scale per macroblock, q_sc, is used. A 
modulation is applied to q_sc, the modulation using information about space-time 
details. For each macroblock the normal flow (per pixel) and its average and variance 
(per macroblock) are calculated. From experiments it is known that the normal flow variance 
has a histogram for which the Gamma (Erlang) function is a good fit. With this knowledge, it 
is possible to fit: 

$$M(x) = x \times \exp(-(x - 1))$$ 

(a shifted Gamma (Erlang) function) to the histogram of $\sigma^2$, the normal flow 
variance. With this, the quantisation scale per macroblock becomes: 

$$q\_sc\_m = F(\delta \times q\_sc - \lambda \times M(\sigma^2)),$$ 

where $F(\cdot)$ represents the operations of rounding and table look-up, and $\delta$ and 
$\lambda$ are real numbers (positive for $\delta$, and positive or negative for $\lambda$) 
that are adjusted according to the overall amount of bits preferred to assign per frame 
(video sequence). 
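
The modulation of the quantisation scale described above can be sketched as follows. This is 
a minimal illustration under stated assumptions, not the encoder's actual implementation: the 
function names, the parameter names scale_weight and detail_weight (standing for the two 
real-valued weights in the expression for q_sc_m), their default values, and the modelling of 
F() as rounding followed by clamping to the range 1 to 31 (in place of the table look-up) are 
all hypothetical. The variance passed in is assumed to be non-negative and already normalised. 

```python
import math


def shifted_gamma(x: float) -> float:
    """Shifted Gamma (Erlang) curve M(x) = x * exp(-(x - 1)).

    It is 0 at x = 0 and reaches its maximum value 1 at x = 1.
    """
    return x * math.exp(-(x - 1.0))


def modulated_q_scale(q_sc: float, flow_variance: float,
                      scale_weight: float = 1.0,
                      detail_weight: float = 4.0,
                      q_min: int = 1, q_max: int = 31) -> int:
    """Return a per-macroblock quantisation scale.

    Assumption: F() is modelled as rounding plus clamping to
    [q_min, q_max]; the table look-up of the text is not reproduced.
    Assumption: flow_variance is non-negative and already normalised.
    """
    raw = scale_weight * q_sc - detail_weight * shifted_gamma(flow_variance)
    return max(q_min, min(q_max, round(raw)))


if __name__ == "__main__":
    # Print the modulated scale for a few example variances.
    for variance in (0.0, 0.5, 1.0, 3.0):
        print(variance, modulated_q_scale(10, variance))
```

With a positive detail_weight, a macroblock whose normalised variance lies near the peak of 
M receives a smaller scale and hence a finer quantisation: with the defaults above, a base 
scale of 10 stays at 10 for a variance of 0 and drops to 6 for a variance of 1. 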

Fig. 4 shows an example of a histogram plotted for a sequence exhibiting image parts 
with a high amount of space-time details. The sequence processed shows a girl 
running in the foreground, while part of the background is the sea with water waves hitting 
rocks. The histogram of Fig. 4 shows the number of blocks as a function of normal flow 
variance. The white bars indicate flat areas, i.e. areas with a small amount of space-time 
details, e.g. the sky. The black bars indicate areas with a high amount of space-time details, 
e.g. water waves hitting the rocks. As seen from the histogram there is a good correlation 
between space-time details and normal flow variance, since bars representing areas with a 
small amount of space-time details are grouped towards low normal flow variance values, 
while bars representing a high amount of space-time details are grouped towards high normal 
flow variance values. 
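
A histogram of the kind shown in Fig. 4 could be produced along the following lines. The 
sketch assumes the common textbook definition of normal flow as the negated temporal 
derivative divided by the spatial gradient magnitude, estimated with central differences and 
a plain frame difference. The function names, the 8x8 block size and the binning are 
hypothetical choices made for the example and are not taken from the text. 

```python
from statistics import pvariance


def normal_flow(prev, curr, eps=1e-6):
    """Per-pixel normal flow -I_t / |grad I| for two grey-level frames.

    prev, curr: equally sized 2D lists of numbers. Border pixels and
    pixels with a vanishing spatial gradient are set to 0.
    """
    h, w = len(curr), len(curr[0])
    flow = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (curr[y][x + 1] - curr[y][x - 1]) / 2.0  # horizontal gradient
            iy = (curr[y + 1][x] - curr[y - 1][x]) / 2.0  # vertical gradient
            it = curr[y][x] - prev[y][x]                  # temporal derivative
            mag = (ix * ix + iy * iy) ** 0.5
            flow[y][x] = -it / mag if mag > eps else 0.0
    return flow


def block_variances(flow, block=8):
    """Population variance of the normal flow inside each full block."""
    h, w = len(flow), len(flow[0])
    out = []
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            tile = [flow[y][x]
                    for y in range(by, by + block)
                    for x in range(bx, bx + block)]
            out.append(pvariance(tile))
    return out


def histogram(values, bin_width, n_bins):
    """Count values per bin; values past the last bin fall into the last bin."""
    counts = [0] * n_bins
    for v in values:
        counts[min(n_bins - 1, int(v / bin_width))] += 1
    return counts
```

Applied to two identical frames, every block variance is zero and all blocks fall into the 
first bin, which corresponds to the flat areas grouped at low variance values in Fig. 4. 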

In the foregoing, and also with regard to the accompanying claims, it will be 
appreciated that expressions such as "incorporate", "contain", "include", "comprise", "is" and 
"have" are intended to be construed non-exclusively, namely other parts or components are 
potentially present which have not been explicitly specified. 



